Two-Band Excitation for HMM-Based Speech Synthesis

Authors

Sang-Jin Kim

Minsoo Hahn

Abstract

This paper presents a two-band excitation for hidden Markov model (HMM)-based speech synthesis that is built on optimum maximum voiced frequency (MVF) estimation. The MVF is estimated with an analysis-by-synthesis scheme that selects the value leading to the minimum spectral distortion of the synthesized speech. Experimental results show that the proposed method significantly improves synthetic speech quality.

© 2009 Seungho Han et al.

Similar resources

Improving Arabic HMM based speech synthesis quality

HMM-based speech synthesis, in which speech parameters are generated directly from HMM models, is a new technique relative to other speech synthesis techniques. In this paper, we propose some modifications to the basic system to improve its quality. We apply a multi-band excitation model and use samples extracted from the spectral envelope as spectral parameters. In the synthesis, the voiced an...

Parameterization of vocal fry in HMM-based speech synthesis

HMM-based speech synthesis offers a way to generate speech with different voice qualities. However, sometimes databases contain certain inherent voice qualities that need to be parametrized properly. One example of this is vocal fry typically occurring at the end of utterances. A popular mixed excitation vocoder for HMM-based speech synthesis is STRAIGHT. The standard STRAIGHT is optimized for ...

Towards an improved modeling of the glottal source in statistical parametric speech synthesis

This paper proposes the use of the Liljencrants-Fant model (LF-model) to represent the glottal source signal in HMM-based speech synthesis systems. These systems generally use a pulse train to model the periodicity of the excitation signal of voiced speech. However, this model produces a strong and uniform harmonic structure throughout the spectrum of the excitation which makes the synthetic spe...

Sub-band text-to-speech combining sample-based spectrum with statistically generated spectrum

As described in this paper, we propose a sub-band speech synthesis approach to develop a high quality Text-to-Speech (TTS) system: a sample-based spectrum is used in the high-frequency band and spectrum generated by HMM-based TTS is used in the low-frequency band. Herein, sample-based spectrum means spectrum selected from a phoneme database such that it is the most similar to spectrum generated...

Analysis on the Importance of Short-Term Speech Parameterizations for Emotional Statistical Parametric Speech Synthesis

This paper presents a study on the importance of short-term spectral and excitation parameterizations for emotional hidden Markov model (HMM)-based speech synthesis. The analysis is performed through an emotion classification task by using two methods: K-means emotion clustering and Gaussian Mixture Models (GMMs)-based emotion identification. Two known forms of parameterization for the short-term...

Journal: IEICE Transactions

Volume: 90-D

Publication date: 2007
